An Efficient Gene Selection Technique based on Fuzzy C-means and Neighborhood Rough Set
Authors
Abstract
Selecting genes from microarray gene expression datasets has become an important research topic, because such data typically consist of a large number of genes and a small number of samples. To avoid information loss, neighborhood mutual information is used in this work to evaluate the relevance between genes. First, an improved Relief feature selection algorithm is proposed to create candidate feature subsets. Then, the cohesion degree of an object's neighborhood and the coupling degree between the neighborhoods of objects are defined based on neighborhood mutual information. Furthermore, a new method for initializing the cluster centers of the Fuzzy C-means (FCM) algorithm is proposed. FCM is a clustering method that allows one data point to belong to two or more clusters. Moreover, because the neighborhood rough set is an effective tool for extracting and selecting features, a novel gene selection algorithm based on FCM and the neighborhood rough set is proposed. Finally, to evaluate the performance of the proposed approach, we apply it to five well-known gene expression datasets. Experimental results show that the proposed approach selects genes effectively and achieves high and stable classification performance.
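The statement that FCM lets one data point belong to two or more clusters can be illustrated with a minimal sketch of the standard Fuzzy C-means update rules. This is not the algorithm proposed in the paper: the function name `fcm`, the random initialization of memberships, the fuzzifier `m = 2`, and the Euclidean distance are all assumptions made for illustration, and the paper's neighborhood-based initialization of cluster centers is not reproduced here.

```python
import numpy as np

def fcm(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-means (illustrative sketch, random initialization).

    X: array of shape (n_samples, n_features).
    Returns (centers, U), where U[j, i] is the membership of sample j in cluster i.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    # Assumption: start from random memberships, normalized so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Each center is the membership-weighted mean of all samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ji proportional to d_ji^(-2/(m-1)), rows sum to 1.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy data: two tight groups and one point lying between them.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
              [2.5, 2.5]])
centers, U = fcm(X, n_clusters=2)
print(np.round(U, 3))
```

In this toy example the last row of `U` should show a clearly split membership for the in-between point, while the other points belong almost entirely to one cluster. That soft assignment is what distinguishes FCM from hard clustering such as k-means.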
Similar resources
Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets
With the advancement of metagenome data mining, science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, the process of gene selection or feature selection is essential. It follows that redundant genes can be removed to increase the speed and accuracy of classification. After applying the gene se...
Rough approximation operators based on quantale-valued fuzzy generalized neighborhood systems
Let $L$ be an integral and commutative quantale. In this paper, by fuzzifying the notion of generalized neighborhood systems, the notion of $L$-fuzzy generalized neighborhood system is introduced, and then a pair of lower and upper approximation operators based on it are defined and discussed. It is proved that these approximation operators include generalized neighborhood system...
Fuzzy and Rough Set Theory Based Gene Selection Method
The selection of genes from microarray gene expression datasets has become an important research topic in cancer classification because such data typically consist of a large number of genes and a small number of samples. In this work, neighborhood mutual information is used to evaluate the relevance between genes and to avoid information loss. First, an improved Relief Feature Selectio...
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Feature subset selection based on fuzzy neighborhood rough sets
Rough set theory has been extensively discussed in machine learning and pattern recognition, where it provides another important theoretical tool for feature selection. In this paper, we construct a novel rough set model for feature subset selection. First, we define the fuzzy decision of a sample by using the concept of fuzzy neighborhood. A parameterized fuzzy relation is introduced to character...
Publication date: 2014